1001 Paraphrases: Incenting Responsible Contributions in Collecting Paraphrases from Volunteers

Author

  • Timothy Chklovski
Abstract

A variety of applications can benefit from broad and detailed repositories of linguistic and world knowledge. An emerging approach to acquiring such repositories is to collect them from volunteer contributors. To increase the volume of contributions, some deployed systems for collecting volunteer-contributed knowledge offer recognition or prizes to those who provide the highest volume of contributions. However, rewarding for volume alone can encourage irresponsible contributions by unscrupulous participants. In this paper, we present an approach to collection from volunteers which incents responsible contributions. Rather than asking contributors to simply enter knowledge, our approach is to collect additional answers by asking contributors to guess partially obfuscated answers. To test the approach, we have implemented an online game, 1001 Paraphrases (http://aigames.org/paraphrase.html), and deployed it to collect 20,944 entries paraphrasing 400 statements. We present preliminary observations and lessons learned on the success of the approach.

Similar resources

External Plagiarism Detection based on Human Behaviors in Producing Paraphrases of Sentences in English and Persian Languages

With the advent of the internet and easy access to digital libraries, plagiarism has become a major issue. Applying search engines is one of the plagiarism detection techniques that converts plagiarism patterns to search queries. Generating suitable queries is the heart of this technique and existing methods suffer from lack of producing accurate queries, Precision and Speed of retrieved result...

UCD-PN: Selecting General Paraphrases Using Conditional Probability

We describe a system which ranks human-provided paraphrases of noun compounds, where the frequency with which a given paraphrase was provided by human volunteers is the gold standard for ranking. Our system assigns a score to a paraphrase of a given compound according to the number of times it has co-occurred with other paraphrases in the rest of the dataset. We use these co-occurrence statistic...

Chinese Whispers: Cooperative Paraphrase Acquisition

We present a framework for the acquisition of sentential paraphrases based on crowdsourcing. The proposed method maximizes the lexical divergence between an original sentence s and its valid paraphrases by running a sequence of paraphrasing jobs carried out by a crowd of non-expert workers. Instead of collecting direct paraphrases of s, at each step of the sequence workers manipulate semantical...

A Class-oriented Approach to Building a Paraphrase Corpus

Towards deep analysis of compositional classes of paraphrases, we have examined a class-oriented framework for collecting paraphrase examples, in which sentential paraphrases are collected for each paraphrase class separately by means of automatic candidate generation and manual judgement. Our preliminary experiments on building a paraphrase corpus have so far been producing promising results, ...

Extracting Paraphrases from a Parallel Corpus

While paraphrasing is critical both for interpretation and generation of natural language, current systems use manual or semi-automatic methods to collect paraphrases. We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text. Our approach yields phrasal and single word lexical paraphrases as well as sy...

Publication date: 2005